Towards Audio-Visual On-line Diarization Of Participants In Group Meetings
Authors
Abstract
We propose a fully automated, unsupervised, and non-intrusive method of identifying the current speaker audio-visually in a group conversation. This is achieved without specialized hardware, user interaction, or prior assignment of microphones to participants. Speakers are identified acoustically using a novel on-line speaker diarization approach. The output is then used to find the corresponding person in a four-camera video stream by approximating individual activity with computationally efficient features. We present results showing the robustness of the association on over 4.5 hours of non-scripted audio-visual meeting data.
Similar resources
Speech/Non-Speech Detection in Meetings from Automatically Extracted low Resolution Visual Features
In this paper we address the problem of estimating who is speaking from automatically extracted low resolution visual cues in group meetings. Traditionally, the task of speech/non-speech detection or speaker diarization tries to find “who speaks and when” from audio features only. In this paper, we investigate more systematically how speaking status can be estimated from low resolution video. We e...
Multimodal Speaker Diarization Utilizing Face Clustering Information
Multimodal clustering/diarization tries to answer the question “who spoke when” by using audio and visual information. Diarization consists of two steps: first, segmentation of the audio information and detection of the speech segments, and then clustering of the speech segments to group them by speaker. This task has been mainly studied on audiovisual data from meetings, news broadcasts or talk ...
Audio Segmentation for Meetings Speech Processing
Audio Segmentation for Meetings Speech Processing, by Kofi Agyeman Boakye. Doctor of Philosophy in Engineering (Electrical Engineering and Computer Sciences), University of California, Berkeley. Professor Nelson Morgan, Chair. Perhaps more than any other domain, meetings represent a rich source of content for spoken language research and technology. Two common (and complementary) forms of meeting spee...
Analysis of the No Return Point Hypothesis: The Effect of Audio and Visual Stimuli in the Fast Movements Inhibition
Background. The No Return Point hypothesis is a research area related to the motor program. The hypothesis emphasizes that a movement cannot be inhibited once the motor program has started it. Several factors affect the mechanism of this inhibition. Objectives. In this study, we investigate the effects of audio and visual stimuli on blocking quick moves to ...
Effectiveness of Media Intervention on Students' Attitudes toward Drug and Tobacco: Based on Health Education and Legal Consequences
Background and purpose: Drug and tobacco addiction is one of the major threats to adolescents and educating this group could be of great benefit in preventing the problem. The purpose of this study was to investigate the effectiveness of media intervention on students' attitude towards drug and tobacco use. Materials and methods: In this quasi-experimental research a male high school was ra...
Journal title:
Volume / issue:
Pages: -
Publication date: 2008